Maximum Margin Dirichlet Process Mixtures for Clustering
نویسندگان
چکیده
The Dirichlet process mixtures (DPM) can automatically infer the model complexity from data. Hence it has attracted significant attention recently, and is widely used for model selection and clustering. As a generative model, it generally requires prior base distribution to learn component parameters by maximizing posterior probability. In contrast, discriminative classifiers model the conditional probability directly, and have yielded better results than generative classifiers. In this paper, we propose a maximum margin Dirichlet process mixture for clustering, which is different from the traditional DPM for parameter modeling. Our model takes a discriminative clustering approach, by maximizing a conditional likelihood to estimate parameters. In particular, we take a EM-like algorithm by leveraging Gibbs sampling algorithm for inference, which in turn can be perfectly embedded in the online maximum margin learning procedure to update model parameters. We test our model and show comparative results over the traditional DPM and other nonparametric clustering approaches.
منابع مشابه
Robust Bayesian Max-Margin Clustering
We present max-margin Bayesian clustering (BMC), a general and robust framework that incorporates the max-margin criterion into Bayesian clustering models, as well as two concrete models of BMC to demonstrate its flexibility and effectiveness in dealing with different clustering tasks. The Dirichlet process max-margin Gaussian mixture is a nonparametric Bayesian clustering model that relaxes th...
متن کاملSmall-Variance Asymptotics for Dirichlet Process Mixtures of SVMs
Infinite SVM (iSVM) is a Dirichlet process (DP) mixture of large-margin classifiers. Though flexible in learning nonlinear classifiers and discovering latent clustering structures, iSVM has a difficult inference task and existing methods could hinder its applicability to large-scale problems. This paper presents a smallvariance asymptotic analysis to derive a simple and efficient algorithm, whi...
متن کاملDensity Modeling and Clustering Using Dirichlet Diffusion Trees
I introduce a family of prior distributions over multivariate distributions, based on the use of a “Dirichlet diffusion tree” to generate exchangeable data sets. These priors can be viewed as generalizations of Dirichlet processes and of Dirichlet process mixtures, but unlike simple mixtures, they can capture the hierarchical structure present in many distributions, by means of the latent diffu...
متن کاملDirichlet Process Parsimonious Mixtures for clustering
The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian prospective. We propose new Dirichlet Process Parsimonious mixtur...
متن کاملRevisiting k-means: New Algorithms via Bayesian Nonparametrics
Bayesian models offer great flexibility for clustering applications—Bayesian nonparametrics can be used for modeling infinite mixtures, and hierarchical Bayesian models can be utilized for sharing clusters across multiple data sets. For the most part, such flexibility is lacking in classical clustering methods such as k-means. In this paper, we revisit the k-means clustering algorithm from a Ba...
متن کامل